When did US states close their schools when the COVID pandemic hit?

Visualizing the impact of testing, school closure policy, and governor's political party on COVID-19 cases

Dataset & Key Statistics

We combined data from several datasets:

This dataset, published by the COVID Tracking Project, contains daily updates from 56 states and territories on the numbers of positive and negative COVID-19 tests and deaths from March 4 to the current date (33 time points in total). Since March 16, 2020, 56 rows, one per US state or territory, have been added to the dataset each day.

This dataset is published by Education Week and tracks the status of school closings over time, first by district and then by state.

This dataset is published by The Henry J. Kaiser Family Foundation and summarizes the political party affiliation of each US state's government.

This dataset is published by the US Census Bureau. It contains population estimates by state for 2019.

Merged Dataset Summary:

The first six rows of our merged dataset appear as follows:

Our merged dataset has the following columns and attributes:

    state [chr]: State abbreviation

    date [date]: Date of the data collected

    positive [numeric]: Total number of reported tests with positive results

    negative [numeric]: Total number of reported tests with negative results

    total [numeric]: Total number of tests conducted, as reported

    death [numeric]: Total number of deaths reported

    Governor.Political.Affiliation [chr]: Party affiliation of the state governor

    StateClosureStartDate [date]: Date the state enforced public school closure

    Region [chr]: The region of the US the state is located in

    POPESTIMATE2019 [numeric]: The state's estimated population in 2019

    ClosureDateCat [chr]: The tertile that the state's school closure date falls in. The first third of states to close are placed in the "Early" category, the next third in the "Middle" category, and the last third in the "Late" category.
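The tertile assignment behind ClosureDateCat can be sketched as follows. This is a minimal illustration in standard-library Python, not the project's own R code; the function name `closure_tertiles`, the alphabetical tie-breaking rule, and the placeholder states and dates are all assumptions made for the example.

```python
from datetime import date

TERTILE_LABELS = ("Early", "Middle", "Late")

def closure_tertiles(closure_dates):
    """Label each state "Early", "Middle", or "Late" by closure-date rank.

    closure_dates maps a state abbreviation to a datetime.date.
    States are ranked by date, with ties broken alphabetically (an
    assumption), then split into three groups of near-equal size.
    """
    ranked = sorted(closure_dates, key=lambda s: (closure_dates[s], s))
    n = len(ranked)
    # rank * 3 // n maps ranks 0..n-1 onto tertile indices 0, 1, 2
    return {state: TERTILE_LABELS[rank * 3 // n]
            for rank, state in enumerate(ranked)}

# Hypothetical placeholder states and dates, not real closure data
example_dates = {
    "AA": date(2020, 3, 16), "BB": date(2020, 3, 16),
    "CC": date(2020, 3, 17), "DD": date(2020, 3, 18),
    "EE": date(2020, 3, 19), "FF": date(2020, 3, 23),
}
labels = closure_tertiles(example_dates)
print(labels)
```

Because the split is by rank, states that closed on the same date can land in different categories when a tie straddles a tertile boundary; splitting on date cut points instead would avoid this at the cost of unequal group sizes.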

Figure 1: The number of rows recorded each day, where each row for a given date corresponds to a state. We use only the dates that have data for all states, which run from March 16 to the date we last pulled the data.

Figure 2: The number of states closing public schools for each date

Figure 3: The daily cumulative number of tests, positive and negative results, and deaths

Biases

State Historical Data from the COVID Tracking Project

Data Source: statewide public health departments. https://covidtracking.com/data

Sources of bias:

State-wide school closure enforcement date from Education Week

Data sources: Local news reporting; National Center for Education Statistics; school/district websites; government websites; clicking through suggests strong reliance on state-level announcements

Sources of bias:

State governor’s political parties from The Henry J. Kaiser Family Foundation

Data source: National Governors Association as of January 3, 2019

Sources of bias:

State population data (estimation for 2019) from the US Census Bureau

Data sources: Census Bureau estimations

Sources of bias:

Visualizations

With the data outlined above, one can create visualizations that illustrate:

Visualization Tasks

  1. Overview via parallel coordinate plots that display states’ school closure date information, total tests performed, governor political party and region

  2. Map that shows geographical relationships, with the ability to hover to reveal state characteristics (i.e., governor political party and state-mandated school closure)

  3. Detailed line plot and heatmap displaying selected entities (listed below) centered around school closure date and showing incubation periods

Five Design Sheet Methodology

Visualization Challenges

General
  • Fifty states make for a lot of data points to display at once. One of our challenges is to give an overview in which a pattern is easy to see without the display looking busy.

  • We may find that there is only a weak signal, or no pattern at all, in our visualization.

  • The target audience (policymakers) must find the visualization user-friendly.

  • We are limited by Plotly's capabilities.

Heatmap
  • Difficult to compare rate of change between groupings (regions, political parties)
Line plots
  • No “ordering” of the states
  • Difficult to display state names in a “clean” and organized way
  • Only displays one type of value (y-axis) at a time
Parallel Coordinate Axis Plot
  • It will likely be difficult to follow one observation (a state's line) across the different axes, so we need to find a way to make this easier.
Map
  • Limited by packages that focus only on the 50 states. Unable to include the US territory data.

Implementation Plan

We will first merge the COVID tracking data and the school closure data by state; all missing values (i.e., the number of tests/cases) will be left empty in the visualization (no imputation will be conducted). The only calculation needed is the rate of change in cases, which will be computed when the visualization loads. The visualization will be implemented in R, specifically using Shiny and Plotly.
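The merge and the rate-of-change calculation described above can be sketched as follows. This is an illustrative standard-library Python version, not the project's R implementation; the helper names `merge_by_state` and `pct_change`, the choice of day-over-day percent change as the rate, and the placeholder state "AA" with its counts are assumptions.

```python
def merge_by_state(daily_rows, closure_dates):
    """Left-join daily COVID rows with school closure dates on 'state'.

    A state with no closure record gets None, mirroring the plan to
    leave missing values empty rather than impute them.
    """
    return [
        {**row, "StateClosureStartDate": closure_dates.get(row["state"])}
        for row in daily_rows
    ]

def pct_change(values):
    """Day-over-day percent change of a series of counts.

    The result is None for the first day and wherever the change is
    undefined (a missing value, or a previous value of zero).
    """
    if not values:
        return []
    changes = [None]
    for prev, cur in zip(values, values[1:]):
        if prev is None or cur is None or prev == 0:
            changes.append(None)
        else:
            changes.append(100.0 * (cur - prev) / prev)
    return changes

# Hypothetical placeholder rows, not real counts
daily_rows = [
    {"state": "AA", "date": "2020-03-16", "positive": 10},
    {"state": "BB", "date": "2020-03-16", "positive": 4},
]
merged = merge_by_state(daily_rows, {"AA": "2020-03-16"})
rates = pct_change([10, 15, None, 30, 0, 5])
print(merged)
print(rates)
```

Returning None instead of zero for undefined changes keeps gaps visible in a plot, consistent with the decision not to impute.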

The parallel coordinates (PC) plot will be implemented first, with the ability to click on a category axis title to sort (and color) the states by governor political party, region, or time of school closure (grouped as a categorical variable). The PC plot will have at least five axes: state, governor party, region, closure date, and current number of cases. Whenever the user selects a category from the dropdown list, both the region colors on the map and the line colors on the PC plot will update accordingly.

The map will be implemented next. When the user hovers over a state, the total number of positive test results will appear.

By default, a line chart will be displayed next to the map, showing the individual data and best-fit line of the number of cases over time for each category (or for the US as a whole). The line chart shows the number of cases over time centered on the school closure date, so that the x-axis represents the number of days before and after school closure. A toggle button will let the user switch between the line chart and a heatmap that shows the case rate (i.e., the % change in cases, or the derivative) over time.
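Centering each state's series on its school closure date amounts to replacing the calendar date with a signed day offset. The sketch below shows one way to do this in standard-library Python; it is not the project's R code, and the function names, the choice to skip states without a closure date, and the placeholder rows are assumptions.

```python
from datetime import date

def days_from_closure(obs_date, closure_date):
    """Signed offset in days: negative before closure, 0 on the closure
    date, positive after. Returns None if no closure date is known."""
    if closure_date is None:
        return None
    return (obs_date - closure_date).days

def center_on_closure(rows):
    """Group rows by state as (offset, positive) pairs sorted by offset.

    States without a closure date are skipped (an assumption), since
    they cannot be placed on the centered x-axis.
    """
    centered = {}
    for row in rows:
        offset = days_from_closure(row["date"], row["StateClosureStartDate"])
        if offset is None:
            continue
        centered.setdefault(row["state"], []).append((offset, row["positive"]))
    for series in centered.values():
        series.sort(key=lambda pair: pair[0])
    return centered

# Hypothetical placeholder rows, not real counts
closure = date(2020, 3, 16)
rows = [
    {"state": "AA", "date": date(2020, 3, 18), "positive": 30,
     "StateClosureStartDate": closure},
    {"state": "AA", "date": date(2020, 3, 14), "positive": 5,
     "StateClosureStartDate": closure},
    {"state": "AA", "date": date(2020, 3, 16), "positive": 12,
     "StateClosureStartDate": closure},
    {"state": "BB", "date": date(2020, 3, 18), "positive": 7,
     "StateClosureStartDate": None},
]
centered = center_on_closure(rows)
print(centered)
```

With this offset on the x-axis, day 0 lines up across states regardless of when each one closed, which is what allows trends before and after closure to be compared directly.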

Design Sheets

Since we completed this process virtually, we adapted the design sheet process so that sheets 2, 3, and 4 were done in parallel before we met as a group to combine them for sheet 5. The design sheets are below in slide format.

Sheet 1 is zoomed in so that the font is not too small.

Sheet 1: Brainstorm (done collaboratively)



Sheet 2: Initial Design A (Jon)



Sheet 3: Initial Design B (Kathleen)



Sheet 4: Initial Design C (Chen)



Sheet 5: Realization Design (done collaboratively)



Team Member Contributions

Chen

Jon

Kathleen

Dany

Screenshots

Overall layout of visualization

Side bar controls

Parallel Coordinate Plot - Linkage by selecting Democratic States

Parallel Coordinate Plot - Comparing Early vs Late Closures

Map - showing range of timing of school closures by color

Line graph of % change in new deaths, showing trend lines before and after school closure by political party

Heat map of % change in new positive cases by state, over time

Biases of our visualization

  • As with all datasets that report the number of positives, the counts partly reflect the volume of testing as well as the true case load.

  • In the early stages of the pandemic, many interventions were implemented at once (school closings, travel restrictions, social distancing, stay-at-home orders), so it is difficult to tease out which intervention had the greatest effect.

  • It is difficult to separate cause from effect.

  • There may be some confounding factors that have yet to be identified.

  • Because of the incubation period, there is a delay before any effect of a policy appears in the data.

Future Work

In the future, we would like to:

Below we show these ideas on a design sheet:

Future Directions Sheet 1

References

  1. The COVID Tracking Project From The Atlantic. Historical state data, downloaded from https://covidtracking.com/api, April 28, 2020.

  2. Education Week. Coronavirus and School Closures: State Data, downloaded from https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html, April 7, 2020.

  3. The Henry J. Kaiser Family Foundation. State Political Parties, downloaded from https://www.kff.org/other/state-indicator/state-political-parties/?currentTimeframe=0&sortModel=%7B%22colId%22:%22Location%22,%22sort%22:%22asc%22%7D, April 26, 2020.

  4. United States Census Bureau. State population data (estimation for 2019), downloaded from https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/state/detail/, May 1, 2020.